A Distributed Information Divergence Estimation over Data Streams
Similar resources
Density Estimation over Data Streams
A growing number of real-world applications share the property that they have to deal with transient data arriving in massive volumes, so-called data streams. The characteristics of these data streams render their analysis by means of conventional techniques extremely difficult, in the majority of cases even impossible. In fact, to be applicable to data streams, a technique has to meet rigid proc...
Monitoring frequent items over distributed data streams
Robert H. Fuller, April 3, 2007. Many important applications require the discovery of items that have occurred frequently. Knowledge of these items is commonly used in anomaly detection and network monitoring tasks. Effective solutions for this problem focus mainly on reducing memory requirements in a centralized environment. These solution...
Distinct-Values Estimation over Data Streams
In this chapter, we consider the problem of estimating the number of distinct values in a data stream with repeated values. Distinct-values estimation was one of the first data stream problems studied: In the mid-1980s, Flajolet and Martin gave an effective algorithm that uses only logarithmic space. Recent work has built upon their technique, improving the accuracy guarantees on the estimation...
Distributed Continuous Data Aggregation Over Web Service Event Streams
We present a distributed platform for continuous event-based aggregation of Web services and data. The platform both actively monitors Web services and receives XML events using WS-Eventing. An aggregation is composed of multiple inputs such as Web service invocations and event subscriptions, which are formulated and processed in a query language based on XQuery. Our query model allows to speci...
Feature Selection over Distributed Data Streams through Convex Optimization
Monitoring data streams in a distributed system has attracted considerable interest in recent years. The task of feature selection (e.g., by monitoring the information gain of various features) requires a very high communication overhead when addressed using straightforward centralized algorithms. While most of the existing algorithms deal with monitoring simple aggregated values such as freque...
Journal
Journal title: IEEE Transactions on Parallel and Distributed Systems
Year: 2014
ISSN: 1045-9219
DOI: 10.1109/tpds.2013.101